Impact of on-chip network parameters on nuca cache performances

نویسندگان

  • Alessandro Bardine
  • Manuel Comparetti
  • Pierfrancesco Foglia
  • Giacomo Gabrielli
  • Cosimo Antonio Prete
چکیده

Non-uniform cache architectures (NUCAs) are a novel design paradigm for large last-level on-chip caches, which have been introduced to deliver low access latencies in wire-delay-dominated environments. Their structure is partitioned into sub-banks and the resulting access latency is a function of the physical position of the requested data. Typically, NUCA caches employ a switched network, made up of links and routers with buffered queues, to connect the different sub-banks and the cache controller, and the characteristics of the network elements may affect the performance of the entire system. This work analyses how different parameters for the network routers, namely cut-through latency and buffering capacity, affect the overall performance of NUCA-based systems for the single processor case, assuming a reference NUCA organisation proposed in literature. The entire analysis is performed utilising a cycle-accurate execution-driven simulator of the entire system and real workloads. The results indicate that the sensitivity of the system to the cut-through latency is very high, thus limiting the effectiveness of the NUCA solution, and that modest buffering capacity is sufficient to achieve a good performance level. As a consequence, in this work we propose an alternative clustered NUCA organisation that limits the average number of hops experienced by cache accesses. This organisation is better performing and scales better as the cut-through latency increases, thus simplifying the implementation of routers, and it is also more effective than another latency reduction solution proposed in literature (hybrid network).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On-Chip Networks: Impact on the Performance of NUCA Caches

Non Uniform Cache Architectures (NUCA) are a new design paradigm for large last-level on-chip caches and have been introduced to deliver low access latencies in wire-delay dominated environments. Their structure is partitioned into sub-banks and the resulting access latency is a function of the physical position of the requested data. Typically, NUCA caches make use of a switched network to con...

متن کامل

Way adaptable D-NUCA caches

Non-uniform cache architecture (NUCA) aims to limit the wire-delay problem typical of large on-chip last level caches: by partitioning a large cache into several banks, with the latency of each one depending on its physical location and by employing a scalable on-chip network to interconnect the banks with the cache controller, the average access latency can be reduced with respect to a traditi...

متن کامل

Reducing Sensitivity to NoC Latency in NUCA Caches

Non Uniform Cache Architectures (NUCA) are a novel design paradigm for large last-level on-chip caches which have been introduced to deliver low access latencies in wire-delay dominated environments. Typically, NUCA caches make use of a network-on-chip (NoC) to connect the different sub-banks and the cache controller. This work analyzes how different network parameters, namely hop latency and b...

متن کامل

3D Tree Cache – A Novel Approach to Non- Uniform Access Latency Cache Architectures for 3D CMPs

We consider a non-uniform access latency cache architecture (NUCA) design for 3D chip multiprocessors (CMPs) where cache structures are divided into small banks interconnected by a network-on-chip (NoC). In earlier NUCA designs, data is placed in banks either statically (S-NUCA) or dynamically (D-NUCA). In both SNUCA and D-NUCA designs, scaling to hundreds of cores can pose several challenges. ...

متن کامل

NUCA Caches: Analysis of Performance Sensitivity to NoC Parameters

The on-chip network (NoC) is a fundamental component of Non Uniform Cache Architectures and may significantly affect the performance of the overall system. The analysis described in this work evaluates the performance sensitivity of a single processor system adopting a NUCA L2 cache with respect to some NoC parameters, namely the hop latency and the buffering capacity of routers. The results sh...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IET Computers & Digital Techniques

دوره 3  شماره 

صفحات  -

تاریخ انتشار 2009